Learning Corpus-Invariant Discriminant Feature Representations for Speech Emotion Recognition

Authors

Peng Song

Shifeng Ou

Zhenbin Du

Yanyan Guo

Wenming Ma

Jinglei Liu

Wenming Zheng

Abstract

Speech emotion recognition is an active topic in speech signal processing, and methods for it have developed rapidly in recent years with satisfactory results. However, most of these methods are trained and evaluated on the same corpus. In practice, the training and testing data are often collected from different corpora, and their features often follow different distributions. These discrepancies can greatly degrade recognition performance. To tackle this problem, a novel corpus-invariant discriminant feature representation algorithm, called transfer discriminant analysis (TDA), is presented for speech emotion recognition. The basic idea of TDA is to integrate the kernel LDA algorithm and a similarity measure between distributions into one objective function. Experimental results under cross-corpus conditions show that the proposed method can significantly improve recognition rates.

Key words: speech emotion recognition, transfer learning, dimensionality reduction
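The abstract does not say which similarity measure between distributions TDA uses. As an illustration only, the sketch below assumes the maximum mean discrepancy (MMD) with an RBF kernel, a common choice in transfer learning, and shows how such a measure separates a matched corpus from a mismatched one. The function names, the kernel width `gamma`, and the synthetic Gaussian "corpora" are all hypothetical; the kernel LDA term and the combined TDA objective are not reproduced here.

```python
import math
import random


def rbf(x, y, gamma=0.5):
    # RBF kernel between two feature vectors; gamma is an assumed width.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_kernel(xs, ys, gamma=0.5):
    # Average kernel value over all pairs drawn from the two sample sets.
    total = sum(rbf(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd2(source, target, gamma=0.5):
    # Biased estimate of the squared MMD between two sample sets.
    return (mean_kernel(source, source, gamma)
            + mean_kernel(target, target, gamma)
            - 2.0 * mean_kernel(source, target, gamma))


def sample(rng, n, dim, mean):
    # Synthetic stand-in for corpus features: unit-variance Gaussian vectors.
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
train = sample(rng, 100, 3, 0.0)         # hypothetical training corpus
same_corpus = sample(rng, 100, 3, 0.0)   # test data from the same distribution
other_corpus = sample(rng, 100, 3, 1.5)  # test data from a shifted distribution

mmd_same = mmd2(train, same_corpus)
mmd_other = mmd2(train, other_corpus)
print("MMD^2, same distribution:   ", mmd_same)
print("MMD^2, shifted distribution:", mmd_other)
```

A method in the spirit of the abstract would look for a projection in which a discrepancy like this stays small across corpora while the emotion classes remain well separated.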

Related resources

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

Dimensionality reduction for speech emotion features by multiscale kernels

To achieve efficient and compact low-dimensional features for speech emotion recognition, this paper proposes a novel feature reduction method using multiscale kernels in the framework of graph embedding. With Fisher discriminant embedding graph, multiscale Gaussian kernels are used in constructing optimal linear combination of Gram matrices for multiple kernel learning. To evaluate the propose...

Dimension Reduction and Discriminant Analysis for Japanese Connected Vowel Recognition

The aim of speech recognition is to extract only the linguistic information from speech signals. The acoustic variations caused by non-linguistic factors, such as speaker, communication channel and noise, pose a challenging problem for speech recognition. The same text can lead to different acoustic observations due to different speakers and different environments. To deal with these variations...

Classification of emotional speech using spectral pattern features

Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...

Towards Speech Emotion Recognition "in the Wild" Using Aggregated Corpora and Deep Multi-Task Learning

One of the challenges in Speech Emotion Recognition (SER) “in the wild” is the large mismatch between training and test data (e.g. speakers and tasks). In order to improve the generalisation capabilities of the emotion models, we propose to use Multi-Task Learning (MTL) and use gender and naturalness as auxiliary tasks in deep neural networks. This method was evaluated in within-corpus and vari...

Journal:

IEICE Transactions

Volume 100-D

Publication date: 2017
